Self-Supervised Disentanglement of Harmonic and Rhythmic Features in Music Audio Signals

Author: Wu Yiming
Publication date: 06/09/2023

Latent variable disentanglement aims to infer the multiple informative latent representations that lie behind a data generation process, and it is a key factor in controllable data generation. In this paper, we propose a deep neural network-based self-supervised learning method to infer the disentangled rhythmic and harmonic representations behind music audio generation. We train a variational autoencoder that generates an audio mel-spectrogram from two latent features representing the rhythmic and harmonic content. In the training phase, the variational autoencoder is trained to reconstruct the input mel-spectrogram given its pitch-shifted version. At each forward computation in the training phase, a vector rotation operation is applied to one of the latent features, assuming that the dimensions of the feature vectors are related to pitch intervals. Therefore, in the trained variational autoencoder, the rotated latent feature represents the pitch-related information of the mel-spectrogram, and the unrotated latent feature represents the pitch-invariant information, i.e., the rhythmic content. The proposed method was evaluated using a predictor-based disentanglement metric on the learned features. Furthermore, we demonstrate its application to the automatic generation of music remixes. Comment: Accepted to DAFx 202
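
The latent rotation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the 12-dimensional harmonic latent (one dimension per semitone), the function name `rotate`, the shift direction, and the plain concatenation standing in for the decoder input are all assumptions made for illustration only.

```python
def rotate(vec, shift):
    """Circularly shift `vec` by `shift` positions (positive moves entries to higher indices)."""
    n = len(vec)
    if n == 0:
        return list(vec)
    k = shift % n
    if k == 0:
        return list(vec)
    return list(vec[-k:]) + list(vec[:-k])


# Hypothetical latents: 12 harmonic dimensions (assumed one per semitone)
# and a small rhythmic vector that is never rotated.
harmonic = [float(i) for i in range(12)]
rhythmic = [0.5, -1.0, 2.0]

# Assumed training step: the encoder saw a version pitch-shifted by
# `shift_semitones`, so the harmonic latent is rotated back by the same amount
# before decoding, while the rhythmic latent passes through unchanged.
shift_semitones = 3
shifted_harmonic = rotate(harmonic, shift_semitones)    # stand-in for the encoder output
restored_harmonic = rotate(shifted_harmonic, -shift_semitones)

# Stand-in for the decoder input: both latents side by side.
decoder_input = restored_harmonic + rhythmic
```

Because only the harmonic vector is rotated, any information that must change with the pitch shift has to live there, which is the intuition the abstract gives for why the unrotated vector ends up carrying pitch-invariant content.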

arXiv.org e-Print Archive

Re-configurable Mechatronic Platform

Author: Wu Yiming
Publication venue: Digital WPI
Publication date: 20/09/2011

To meet the growing need for multi-disciplinary engineering education and to provide a re-configurable mechatronic experiment platform, the team seeks to plan, design, and validate a mechatronic platform that allows simple model re-assembly and re-configuration. The platform also follows a modular, expandable design. It consists of re-configurable mechanical structures, diverse sensor applications, a control system built on a microcontroller and motor controller, and graphical user interfaces on a PC terminal for a multi-disciplinary learning experience.

DigitalCommons@WPI

A Unified Generative and Discriminative Approach to Automatic Chord Estimation for Music Audio Signals

Author: Wu Yiming
Publication venue: Kyoto University
Publication date: 24/09/2021

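Kyoto University; new-system, course-based doctorate; Doctor of Informatics; degree no. Kō 23540 (Informatics doctorate no. 770); call no. 新制||情||131 (University Library). Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University. Examiners: Associate Professor Kazuyoshi Yoshii (chief examiner), Professor Tatsuya Kawahara, Professor Ko Nishino, Professor Hisashi Kashima. Conferred under Article 4, Paragraph 1 of the Degree Regulations. DFA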

Kyoto University Research Information Repository